Optimal rotation control for a qubit subject to 
continuous measurement 
Abstract: In this article we analyze the optimal control strategy for rotating a monitored qubit from an initial pure state to an orthogonal state in minimum time. This strategy is described for two different cost functions of interest which do not have the usual regularity properties. Hence, as classically smooth cost functions may not exist, we interpret these functions as viscosity solutions to the optimal control problem. Specifically we prove their existence and uniqueness in this weak-solution setting. In addition, we also give bounds on the time optimal control to prepare any pure state from a mixed state.



I. Introduction

It is anticipated that devices which make use of quantum effects will have a strong impact on future technology [1], [2]. This motivates research on the time optimal procedure for the preparation of any desired state for monitored open quantum systems. For pure states, state preparation can be performed by unitary control whose purpose is to 'rotate' the resulting pure state to a target (pure) state as fast as possible. A significant amount of work has been done on this problem for closed quantum systems [3], [4], [5], [6]. At present little research has been undertaken for the corresponding open system problem.$^{1}$ Thus there is a need to consider more general cases of the time optimal state preparation problem for control theoretic and application driven reasons.

In this article we first consider the time optimal "unitary control" for a qubit undergoing continuous measurement. To measure the speed of convergence, we examine two different cost functions for this unitary control stage. The first is the mean of the times at which each trajectory attains the target (i.e. the expectation of the stopping time, which is the first passage time to the target state), termed the mean hitting time, and the other is the time at which the ensemble average of trajectories attains the target state, termed the expected trajectory hitting time. The Hamilton-Jacobi-Bellman equations that arise in these cases turn out to be degenerate; due to this the uniqueness of the solutions of these equations is not guaranteed. Hence our objective in this article is to obtain the optimal control strategies for these quantum control problems while dealing with these degeneracies.

In the sections that follow, we consider the control problems arising from both these costs and indicate their solution using the dynamic programming approach. Further, we also point out the need for the interpretation of these solutions by a generalized solution framework.
1 One exception is Ref. [7], where a time optimal control problem for monitored open quantum systems was considered for a special case where measurement does not provide information about the system. In this situation the quantum trajectory becomes a linear quantum trajectory, which enabled the authors of [7] to formulate and then solve the system as a linear quadratic type optimal control problem.



A. The problem 

Consider a quantum bit, called a qubit.$^{2}$ An arbitrary qubit state, denoted by the $2\times 2$ matrix $\rho$, is a positive operator in Hilbert space with the constraint that its trace is one. In the Dirac notation, pure states (those states with $\mathrm{Tr}[\rho^2] = 1$) are denoted by a complex vector $|\psi\rangle = (\alpha, \beta)^T$ such that $|\alpha|^2 + |\beta|^2 = 1$. The entire state space of qubits can be represented as a ball of radius one, called the Bloch sphere, where the pure states lie on the surface and the mixed states (those states with $\mathrm{Tr}[\rho^2] < 1$) are in the interior. The $(x, y, z)$ axes of this sphere correspond to the directions of the eigenvectors of the Pauli matrices $\sigma_k$, where $k \in \{x, y, z\}$.$^{3}$ The states in the $z$ direction are usually denoted by $|0\rangle = (1, 0)^T$ (or up in the $z$ direction) and $|1\rangle = (0, 1)^T$ (or down in the $z$ direction). The matrix form for pure states is obtained by taking the outer product of the pure state vector: $|\psi\rangle\langle\psi|$.
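As a small illustration of the state description above, the following sketch builds a density matrix from a pure state vector and evaluates its purity and Bloch components. The helper names (`outer`, `purity`, `bloch_vector`) are hypothetical and not part of the original article; plain nested lists are used so that nothing beyond the Python standard library is assumed.

```python
# Illustrative helpers (hypothetical names) for 2x2 qubit states as nested lists.

# Pauli matrices, with sigma_y = i[0, -1; 1, 0].
SIGMA = {
    "x": [[0, 1], [1, 0]],
    "y": [[0, -1j], [1j, 0]],
    "z": [[1, 0], [0, -1]],
}

def outer(psi):
    """Return the matrix |psi><psi| for a two-component state vector."""
    return [[psi[i] * complex(psi[j]).conjugate() for j in range(2)] for i in range(2)]

def matmul(a, b):
    """Product of two 2x2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def trace(a):
    return a[0][0] + a[1][1]

def purity(rho):
    """Tr[rho^2]: equal to 1 for pure states, below 1 for mixed states."""
    return complex(trace(matmul(rho, rho))).real

def bloch_vector(rho):
    """Bloch components k = Tr[rho sigma_k] for k in (x, y, z)."""
    return tuple(complex(trace(matmul(rho, SIGMA[k]))).real for k in "xyz")

up = outer([1, 0])                      # |0><0|, the north pole of the sphere
mixed = [[0.5, 0.0], [0.0, 0.5]]        # maximally mixed state, centre of the ball
```

Under these definitions `up` sits on the surface of the ball along $+z$, while `mixed` sits at the origin with purity $1/2$.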

Consider a quantum bit, subjected to continuous weak measurement of the Hermitian operator $\sigma_z$ ($z$ component of angular momentum) and feedback. The goal of the feedback is to take the initial state $|0\rangle$ and control it to the orthogonal state $|1\rangle$ in a time optimal manner. Intuitively, this requires a control rotation about the $y$ axis. See Fig. 1 for a representation of this control problem on the Bloch sphere [1].

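The model for such a system is given by the Stochastic Master Equation (SME) [8], [9], [2]:

$$d\rho = -i\,dt\,\frac{\alpha(t)}{2}[\sigma_y, \rho] + 2\gamma\,dt\,\mathcal{D}[\sigma_z]\rho + \sqrt{2\gamma}\,dW\,\mathcal{H}[\sigma_z]\rho. \qquad (1)$$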



The measurement strength $\gamma$ determines the rate at which measurement extracts information about the observable $\sigma_z$. In the equation above: $\alpha(\cdot)$ denotes the control signal;$^{4}$ $\dagger$ denotes the adjoint of an operator; $dW$ is the innovation process [10]; $[A, B]$ is the commutator; and $\mathcal{D}[A]\rho = A\rho A^\dagger - \frac{1}{2}(A^\dagger A\rho + \rho A^\dagger A)$, $\mathcal{H}[A]\rho = A\rho + \rho A^\dagger - \mathrm{Tr}[(A^\dagger + A)\rho]\,\rho$ are superoperators [2].

2 Physically qubits can be realized as two level atoms, see [1, Chap. 7] for more details and examples.

3 The Pauli matrices are: $\sigma_x = [0, 1; 1, 0]$, $\sigma_y = i[0, -1; 1, 0]$, $\sigma_z = [1, 0; 0, -1]$.

4 Physically this could arise from applying classical time dependent fields to a quantum system, which is standard in open loop quantum control.

Fig. 1. The Bloch sphere with a graphical depiction of our control problem. We start in the plus eigenstate of the observable $\sigma_z$ and rotate to the orthogonal state $|-z\rangle$. The controlled rotation axis is out of the page.
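To make the two superoperators in the SME concrete, here is a minimal sketch that evaluates $\mathcal{D}[A]\rho$ and $\mathcal{H}[A]\rho$ for $2\times 2$ matrices stored as nested lists. The function names are illustrative assumptions, not notation from the article.

```python
# Sketch of the superoperators D[A]rho and H[A]rho (hypothetical function names).

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def dagger(a):
    """Conjugate transpose of a 2x2 matrix."""
    return [[complex(a[j][i]).conjugate() for j in range(2)] for i in range(2)]

def add(a, b):
    return [[a[i][j] + b[i][j] for j in range(2)] for i in range(2)]

def scale(c, a):
    return [[c * a[i][j] for j in range(2)] for i in range(2)]

def trace(a):
    return a[0][0] + a[1][1]

def dissipator(A, rho):
    """D[A]rho = A rho A^dagger - (A^dagger A rho + rho A^dagger A) / 2."""
    Ad = dagger(A)
    AdA = matmul(Ad, A)
    jump = matmul(matmul(A, rho), Ad)
    anti = add(matmul(AdA, rho), matmul(rho, AdA))
    return add(jump, scale(-0.5, anti))

def innovation(A, rho):
    """H[A]rho = A rho + rho A^dagger - Tr[(A^dagger + A) rho] rho."""
    Ad = dagger(A)
    mean = trace(matmul(add(Ad, A), rho))
    return add(add(matmul(A, rho), matmul(rho, Ad)), scale(-mean, rho))

sigma_x = [[0, 1], [1, 0]]
sigma_z = [[1, 0], [0, -1]]
# Pure state in the x-z plane with Bloch components x = 0.6, z = 0.8.
rho = [[0.9, 0.3], [0.3, 0.1]]
d_term = dissipator(sigma_z, rho)
h_term = innovation(sigma_z, rho)
```

Both terms are traceless, so the SME preserves $\mathrm{Tr}[\rho] = 1$; for this state $\mathrm{Tr}[\sigma_x \mathcal{D}[\sigma_z]\rho] = -2x$ and $\mathrm{Tr}[\sigma_z \mathcal{H}[\sigma_z]\rho] = 2(1 - z^2)$.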

From Eq. (1) we may calculate the stochastic differential equations (SDE) for the Bloch components using the relation $dk = \mathrm{Tr}[d\rho\,\sigma_k]$ (where $k \in \{x, y, z\}$). Next we assume that the available control is equal in strength (isotropic) about all axes $(x, y, z)$. This drastically simplifies the problem, as we can now exploit this symmetry to consider control about just one axis (the $y$ axis), so the dynamics are restricted to a single plane (the $x$-$z$ plane). We further simplify the problem by transforming to polar coordinates $z = R\cos\theta$, $x = R\sin\theta$, where $\theta = \tan^{-1}(x/z)$, by applying the Ito rules on the equations for the Bloch components. For initially pure states (i.e., $\mathrm{Tr}[\rho^2] = 1$) the Bloch vector is confined to the surface of the sphere, i.e. $R = \sqrt{x^2 + z^2} = 1$, and thus our control problem reduces to the stochastic differential equation (SDE) for an angle:

$$d\theta = \alpha(t)\,dt - 2\gamma\sin(2\theta)\,dt - 2\sqrt{2\gamma}\,\sin(\theta)\,dW, \qquad (2)$$

where $\theta \in [-\pi, \pi]$.

The first term in Eq. (2) is the control signal applied. To ensure that the control problem is well posed we apply a bounded strength control, i.e. the controls are constrained to a compact set $V := [-\Omega, \Omega]$. In addition we require that $\Omega > 2\gamma$ (for reasons required in Thm. 2.2); note that this is a sufficient condition but may not be necessary. The class of piecewise continuous control signals that take their values from $V$ is denoted by $\mathcal{V}$. This set $\mathcal{V}$ is the set of signals which are progressively measurable with respect to the filtration of the random process. The second term in Eq. (2) represents the measurement back action. The final term is the innovation term arising from measurements.

The solution to the SDE Eq. (2) at any time $t \in [t_0, \infty)$, starting from a point $\theta_0$ (at a time $t_0$) and using a control strategy $\alpha \in \mathcal{V}$, is denoted by $\theta(t; \alpha, t_0, \theta_0)$. Note that this is a random variable whose value depends on an underlying sample space. If the arguments used in this expression are clear from context, we represent the solution at time $t$ using the simplified notation $\theta_t$.
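A single sample path of the angle SDE, Eq. (2), can be generated with an Euler-Maruyama discretisation. The sketch below is one simple way to do this and is not taken from the article; the function name, the fixed step size `dt`, and the feedback signature `control(t, theta)` are assumptions made for illustration. The caller is responsible for keeping the control inside $[-\Omega, \Omega]$.

```python
import math
import random

def simulate_angle(theta0, control, gamma, t_final, dt, rng):
    """Euler-Maruyama path of the angle SDE; returns the angle at every step.

    control(t, theta) is a hypothetical feedback law returning the control value.
    """
    theta = theta0
    path = [theta]
    steps = int(round(t_final / dt))
    for n in range(steps):
        dW = rng.gauss(0.0, math.sqrt(dt))            # Wiener increment ~ N(0, dt)
        drift = control(n * dt, theta) - 2.0 * gamma * math.sin(2.0 * theta)
        noise = -2.0 * math.sqrt(2.0 * gamma) * math.sin(theta) * dW
        theta += drift * dt + noise
        path.append(theta)
    return path

# Example: constant maximal control Omega = 10 with gamma = 1, starting at theta = 0.
example = simulate_angle(0.0, lambda t, th: 10.0, 1.0, 0.5, 1e-3, random.Random(0))
```

Note that at $\theta = 0$ both the back action and the noise term vanish, so the first step of any path started there is purely deterministic.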

II. TWO OPTIMAL CONTROL PROBLEMS 

We consider two ways to formulate the cost function for time optimal rotation. The first possible formulation is the expected hitting time, i.e. the expectation of the times at which a trajectory hits the target set $\mathcal{T}$ (say $\pm\pi$). The second formulation is the shortest time at which the average or expected trajectory first reaches $\mathcal{T}$, termed the expected trajectory reaching time. We obtain the Hamilton-Jacobi-Bellman equations to be solved for these problems. Numerical solutions to these HJB equations can be obtained via standard techniques such as value iteration methods.

A. Mean (discounted) hitting time 

The original problem of interest is to determine the average hitting time to the target angle of $\pi$. However it turns out that the Hamilton-Jacobi-Bellman equation that arises from this optimal control problem is a degenerate elliptic PDE. As the existence of classical solutions to this equation is not guaranteed, we formulate an alternate cost function, one for which the existence and uniqueness of generalized (weak) solutions can be rigorously shown. This modified cost is the expected discounted hitting time to the target set $\mathcal{T} := \{\pm\pi\}$. Thus the objective of the control problem is to control the system in order to minimize the expected discounted time to hit the target. We note that the solution to this problem is potentially different from the solution to the original undiscounted problem. The proofs of the uniqueness and existence of the viscosity solutions in the undiscounted case would require results applicable to degenerate HJB PDEs specific to the undiscounted cost function, and we are currently unaware of such results.

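Consider the optimal cost function defined over the set $G := (-\pi, \pi)$, which has the form:

$$S(\theta_0) := \inf_{v \in \mathcal{V}} J_1(\theta_0, v), \qquad (3)$$

$$J_1(\theta_0, v) := \mathbb{E}\left[\int_0^{\tau^v(\theta_0)} \exp\{-\lambda s\}\,ds\right], \qquad (4)$$

$$\tau^v(\theta_0) := \inf\{t \mid \theta(t; v, 0, \theta_0) \in \mathcal{T}\}.$$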
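As a short worked step (not spelled out in the original text), the time integral in Eq. (4) can be evaluated in closed form for any realisation of the hitting time $\tau$ and any $\lambda > 0$, which shows that the discounted cost is always bounded:

```latex
\int_0^{\tau} e^{-\lambda s}\,ds = \frac{1 - e^{-\lambda\tau}}{\lambda}
\quad\Longrightarrow\quad
J_1(\theta_0, v) = \frac{1 - \mathbb{E}\!\left[e^{-\lambda\,\tau^v(\theta_0)}\right]}{\lambda},
\qquad 0 \le J_1(\theta_0, v) \le \frac{1}{\lambda}.
```

For small $\lambda\tau$ the integrand is close to one, so the discounted cost is close to the plain hitting time $\tau$; trajectories that never reach the target contribute the finite value $1/\lambda$.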

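The parameter $\lambda$ ($> 0$) in Eq. (4) is called a discount factor. In order to obtain the optimal control strategy for this problem, we apply the dynamic programming method [11], [12] from optimal control theory. This yields the associated Hamilton-Jacobi-Bellman equation, for the cost function $S$, of the form

$$\sup_{v \in V}\left\{-1 + \lambda\phi(y) - L^v[\phi](y)\right\} = 0, \quad \forall y \in G, \qquad (5)$$

with boundary conditions $\phi(\mathcal{T}) = 0$. The differential operator $L^v[\phi](y)$ in Eq. (5) is defined as

$$L^v[\phi](y) := b(y, v)\left.\frac{\partial\phi}{\partial\theta}\right|_{\theta = y} + \frac{1}{2}\sigma^2(y)\left.\frac{\partial^2\phi}{\partial\theta^2}\right|_{\theta = y} \qquad (6)$$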



and is understood as the generator of the Ito diffusion process Eq. (2). The coefficients $b$, $\sigma$ may be obtained from the relevant SDE Eq. (2), which has the form $dx = b(x, v)\,dt + \sigma(x)\,dW$. Hence, we find that $b(x, v) := v - 2\gamma\sin(2x)$ and $\sigma(x) := 2\sqrt{2\gamma}\sin(x)$ (the overall sign of $\sigma$ is immaterial, since only $\sigma^2$ enters the generator). Note that Eq. (5) takes the form of a degenerate elliptic PDE irrespective of whether $\lambda > 0$ or $\lambda = 0$; this is because the coefficient $\sigma(\cdot)$ of the second order partial derivative term is zero at some points in the domain. Hence the positivity condition on the coefficient of the second order derivative, which is sufficient for classical solutions to this equation to exist, does not hold [10], [13].$^{5}$ Physically this degeneracy corresponds to the presence of a symmetry of revolution around that point. In order to analyze this situation we study the solution to this problem via the notion of generalized (viscosity) solutions to this PDE. The desired uniqueness and existence result for the hitting time function proceeds as follows.

1) Viscosity solution for a discounted hitting time problem: It turns out that the HJB equation associated with this optimal control problem has the form $F(x, u, Du, D^2u) = 0$, with boundary conditions $\phi(\mathcal{T}) = 0$. For the system and cost function under consideration, $F(\cdot)$ is defined as

$$F(x, u, p, M) := -4\gamma[\sin(x)]^2 M + \sup_{\alpha \in V}\left\{-\left[\alpha - 2\gamma\sin(2x)\right]p + \lambda u - 1\right\}. \qquad (7)$$

The second term on the right hand side of the expression above is termed the Hamiltonian and can be represented by a function of the form $H(x, u, p)$. Note that in this case the HJB equation is a degenerate non-linear elliptic PDE, hence classical solutions may not exist. Therefore it is necessary to understand the solution to this equation in a weak (viscosity) sense.

2) Existence and uniqueness of the viscosity solution: 
The following result yields the uniqueness of the hitting time 
function. 

Theorem 2.1: The value function is the unique continuous viscosity solution of the HJB equation $-\frac{1}{2}\sigma^2(x)D^2u + H(x, u, Du) = 0$, $x \in G$, with boundary condition $u(\partial G) = 0$.

Proof: This result follows from Theorem 4.1 in [15]. ■ 

B. Expected trajectory reaching time 

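The cost function to be minimized for this problem takes the form

$$J_2(\theta_0, v) := \inf\{t \mid \mathbb{E}[\theta(t; v, 0, \theta_0)] \in \mathcal{T}\} = \inf\{t \mid |\mathbb{E}[\theta(t; v, 0, \theta_0)]| = \pi\}. \qquad (8)$$

The optimal cost function in this case is:

$$R(\theta_0) := \inf_{v \in \mathcal{V}} J_2(\theta_0, v). \qquad (9)$$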



Now, the HJB equation for the cost function above, with terminal constraints on the expectation $\mathbb{E}[\cdot]$, has a form that does not permit analytical treatment. This is due to the fact that the underlying SDE, Eq. (2), is nonlinear. Consequently the fundamental quantity of interest $\mathbb{E}[\theta(t; \alpha, t_0, \theta_0)]$ does not appear to have a closed form solution. Nevertheless it is possible to solve the control problem for the optimal strategy by formulating an associated problem, whose solution leads to the solution of Eq. (9). This is achieved as follows. Consider the problem of determining the control policy that maximizes $|\mathbb{E}[\theta(T; \alpha, t_0, \theta_0)]|$ at a fixed time $T$. This can be easily seen to be equivalent to choosing the maximum of the terms $M_1(\theta_0)$ and $M_2(\theta_0)$, where these functions arise from the optimal control problems:

$$M_1(\theta_0) := \sup_{\alpha \in \mathcal{V}} \mathbb{E}[\theta(T; \alpha, t_0, \theta_0)], \qquad (10a)$$

$$M_2(\theta_0) := -\inf_{\alpha \in \mathcal{V}} \mathbb{E}[\theta(T; \alpha, t_0, \theta_0)]. \qquad (10b)$$

5 Unsurprisingly, the sufficiency condition for the Fokker-Planck equation [14] to have a smooth solution also does not hold.



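Formally both of these are Mayer type optimal control problems (with zero running cost, a fixed terminal penalty and a fixed time horizon $T > 0$). Specifically, the equations above have the form:

$$M_1(\theta_0) = S^a_T(\theta_0) = \sup_{v \in \mathcal{V}} \mathbb{E}[\theta(T; v, t_0, \theta_0)] = \sup_{v \in \mathcal{V}} \mathbb{E}\left[\int_{t_0}^{T} 0\,ds + \theta_T\right], \qquad (11)$$

$$M_2(\theta_0) = S^b_T(\theta_0) = \sup_{v \in \mathcal{V}} \mathbb{E}[-\theta(T; v, t_0, \theta_0)] = \sup_{v \in \mathcal{V}} \mathbb{E}\left[\int_{t_0}^{T} 0\,ds - \theta_T\right]. \qquad (12)$$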



We now describe the result that links these cost functions $S^a_T$, $S^b_T$ with the original cost function of interest $R$ (Eq. (9)).

Theorem 2.2:

$$R(\theta_0) = \inf\{T \mid \max\{S^a_T(\theta_0), S^b_T(\theta_0)\} \geq \pi\}. \qquad (13)$$
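Proof: We begin by noting that the maximum on the RHS of Eq. (13) can be written as follows:

$$\max\left\{\sup_{v \in \mathcal{V}} \mathbb{E}[\theta(T; v, 0, \theta_0)],\ \sup_{v \in \mathcal{V}} \mathbb{E}[-\theta(T; v, 0, \theta_0)]\right\}$$
$$= \sup_{v \in \mathcal{V}}\left[\max\left(\mathbb{E}[\theta(T; v, 0, \theta_0)],\ -\mathbb{E}[\theta(T; v, 0, \theta_0)]\right)\right]$$
$$= \sup_{v \in \mathcal{V}}\left|\mathbb{E}[\theta(T; v, 0, \theta_0)]\right|. \qquad (14)$$

Hence the statement of this theorem is equivalent to demonstrating that

$$\left\{T \,\Big|\, \inf_{v \in \mathcal{V}}\{t_v \mid |\mathbb{E}[\theta(t_v; v, 0, \theta_0)]| \geq \pi\} \leq T\right\} = \left\{T \,\Big|\, \sup_{v \in \mathcal{V}}|\mathbb{E}[\theta(T; v, 0, \theta_0)]| \geq \pi\right\}, \qquad (15)$$

as the desired result follows immediately from this (by taking an infimum over both sets). Thus the proof proceeds via two steps.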

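Stage 1: We demonstrate that LHS $\supseteq$ RHS, i.e., any element $T$ of the set on the right also belongs to the set on the left.

Given $T_1 \in \{T \mid \sup_{v \in \mathcal{V}}|\mathbb{E}[\theta(T; v, 0, \theta_0)]| \geq \pi\}$ it follows that

$$\exists\, v_1 \in \mathcal{V} \ \text{s.t.}\ |\mathbb{E}[\theta(T_1; v_1, 0, \theta_0)]| \geq \pi. \qquad (16)$$

Therefore $\inf_{v \in \mathcal{V}}\{t_v \mid |\mathbb{E}[\theta(t_v; v, 0, \theta_0)]| \geq \pi\} \leq T_1$. This implies that

$$T_1 \in \left\{T \,\Big|\, \inf_{v \in \mathcal{V}}\{t_v \mid |\mathbb{E}[\theta(t_v; v, 0, \theta_0)]| \geq \pi\} \leq T\right\}, \qquad (17)$$

thereby proving the first part.

Stage 2: We demonstrate that RHS $\supseteq$ LHS.

Let $T_1 \in \{T \mid \inf_{v \in \mathcal{V}}\{t_v \mid |\mathbb{E}[\theta(t_v; v, 0, \theta_0)]| \geq \pi\} \leq T\}$. Hence

$$\exists\, t_1 \leq T_1,\ v_1 \in \mathcal{V} \ \text{s.t.}\ |\mathbb{E}[\theta(t_1; v_1, 0, \theta_0)]| = \pi. \qquad (18)$$

Without loss of generality we consider the case $\mathbb{E}[\theta(t_1; v_1, 0, \theta_0)] = \pi$. The alternative case follows almost directly. Now, by applying a control $+\Omega$ for all $t > t_1$ (that is, taking $v = v_1$ on $[0, t_1]$ and $v = +\Omega$ thereafter) we have from the solution of the SDE that, for any $T \geq t_1$:

$$\mathbb{E}[\theta(T; v, 0, \theta_0)] - \mathbb{E}[\theta(t_1; v_1, 0, \theta_0)] = \mathbb{E}\left[\int_{t_1}^{T}\left[v(t) - 2\gamma\sin(2\theta(t))\right]dt\right]. \qquad (19)$$

Note that due to the assumption on the control signal bound ($\Omega > 2\gamma$), the integrand is positive and it follows that

$$\mathbb{E}[\theta(T; v, 0, \theta_0)] - \mathbb{E}[\theta(t_1; v_1, 0, \theta_0)] \geq 0. \qquad (20)$$

Hence for every $T \geq t_1$, and in particular for $T = T_1$, we have $\sup_{v \in \mathcal{V}}\mathbb{E}[\theta(T; v, 0, \theta_0)] \geq \pi$, leading to the fact that $T_1 \in$ RHS. This proves the second part. From the two stages above, the result follows. ■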
This result indicates an approach to obtain the desired cost function $R$. We first determine the solution to the control problems for $S^a_T$, $S^b_T$ given by Eqns. (11), (12) for a particular value of $T$. Then we can use progressively smaller values of this terminal time $T$, in order to obtain the desired value of $R$ via Eq. (13). Now, to obtain the solutions to the optimal control problems $S^a_T$, $S^b_T$ we apply the dynamic programming method to yield the Hamilton-Jacobi-Bellman equation (HJB). For the sake of brevity, we describe the procedure for $S^a_T$, since the approach for $S^b_T$ then follows immediately. Given any function$^{6}$ $\phi \in C^{1,2}(Q)$, where $Q := [0, T] \times \mathbb{R}$, we formulate the associated HJB equation for the optimal control problem (11) as

$$\sup_{v \in V}\left\{\frac{\partial\phi}{\partial t} + L^v[\phi](y)\right\} = 0, \quad \forall y \in \mathbb{R},\ t \in [0, T]. \qquad (21)$$



The differential operator $L^v[\phi](y)$ in the equation above is defined as before. This HJB equation is solved over the domain $\mathbb{R}$. Note that this domain is different from $[-\pi, \pi]$ since the objective in the control problem is to maximize the final angle. Furthermore, we observe that at the point $\theta = 0$ the control function is non-unique due to the symmetric nature of the problem. Hence there exist two alternatives for the optimal control at $\theta = 0$. We omit this analysis for the sake of brevity. The boundary condition for the PDE in Eq. (21) is:

$$\phi(T, y) = y, \quad \forall y \in \mathbb{R}. \qquad (22)$$



As this HJB equation is degenerate parabolic we indicate 
the existence and uniqueness of the corresponding viscosity 
solution as follows. 

1) Viscosity solution for a finite time horizon problem: Consider a system evolving according to the dynamics $dx(s) = f(s, x(s), v(s))\,ds + \sigma(x(s))\,dw(s)$, for $s \in [t_0, T]$. The expected trajectory hitting time problem for this system can be recast as an optimal control problem over a finite time horizon, with a cost function of the form

$$V(t, \theta_0) := \sup_{v \in \mathcal{V}} \mathbb{E}\left[\int_t^T L(s, x, v)\,ds + \psi(\theta(T))\right]. \qquad (23)$$

The system dynamics and the value function are defined over the cylinder $Q := [t_0, T) \times \mathbb{R}^n$. The HJB equation associated with this optimal control problem has the form

$$-\frac{\partial V}{\partial t} + H(t, x, D_x V, D_x^2 V) = 0, \quad (t, x) \in Q, \qquad (24)$$

where

$$H(t, x, p, A) := \sup_{a}\left\{-f(t, x, a)\,p - \frac{1}{2}\sigma^2(x)A - L(t, x, a)\right\}, \qquad (25)$$

with the boundary condition $V(T, x) = \psi(x)$, $\forall x \in \mathbb{R}^n$.

6 The notation denotes $C^1$ in time and $C^2$ in space.

The following result helps ensure the desired properties of 
the viscosity solution. 

Theorem 2.3: The value function Eq. (23) is the unique, uniformly continuous viscosity solution to the HJB equation Eq. (24).



Proof: The result follows from [13, Ch. V §9 and 
Theorem 9.1]. ■ 
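The search over the terminal time $T$ suggested by Theorem 2.2 can be sketched as a bisection. This is an illustrative reading rather than the authors' procedure: the function name `smallest_horizon` and the callable `reach` are hypothetical, with `reach(T)` standing for any routine that returns $\max\{S^a_T(\theta_0), S^b_T(\theta_0)\}$. Bisection relies on the property used in Stage 2 of the proof, namely that once the threshold $\pi$ is reached it remains reachable for all later horizons.

```python
import math

def smallest_horizon(reach, target=math.pi, t_max=10.0, tol=1e-8):
    """Bisection for inf{T : reach(T) >= target}.

    Assumes reach(T) >= target for every T beyond the first crossing,
    and that the crossing happens before t_max (both are assumptions).
    """
    if reach(t_max) < target:
        raise ValueError("target not reached by t_max")
    lo, hi = 0.0, t_max
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if reach(mid) >= target:
            hi = mid          # threshold already met: shrink from above
        else:
            lo = mid          # threshold not met: horizon must be longer
    return hi

# Sanity check in the measurement-free limit gamma = 0, where the angle obeys
# d(theta) = v dt and the best achievable |E[theta_T]| from theta_0 = 0 is Omega * T.
omega = 10.0
r_no_measurement = smallest_horizon(lambda T: omega * T)
```

In this limiting case the result should agree with the Hamiltonian rotation time $\pi/\Omega$; with measurement present, `reach` would instead wrap a numerical solution of Eq. (21).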

III. Simulations 

In this section we describe numerical solutions and simulation results for the problems described in Sections II-A and II-B. We use a numerical approximation based on the value iteration approach to obtain the solution to the PDEs in Eqns. (5), (21). In this method, the stochastic nature of the system dynamics is captured as the transition probabilities of an approximating Markov chain model. The transition probabilities are then used to form an iterative scheme that converges to yield the value function, i.e., the optimal cost function. For a detailed description of this approach and applications to quantum systems we refer readers to [16], [17].
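The following is a minimal sketch of value iteration on an approximating Markov chain for the discounted hitting time problem of Section II-A. It is not the authors' code: the upwind transition probabilities are one standard construction, and the grid size, tolerance, function name, and the restriction to the two extreme controls $\pm\Omega$ are all assumptions made for illustration. It also assumes $\Omega > 2\gamma$, so that drift and diffusion never vanish together.

```python
import math

def discounted_hitting_time(omega, gamma, lam, n_cells=80, tol=1e-10, max_sweeps=200000):
    """Value iteration on a grid over [-pi, pi] with absorbing end points.

    Returns (thetas, value, policy); value approximates the discounted cost S.
    """
    h = 2.0 * math.pi / n_cells
    thetas = [-math.pi + i * h for i in range(n_cells + 1)]
    controls = (-omega, omega)   # assumption: only the extreme controls are tried
    # Per node and control: (one-step discounted cost, discount, p_up, p_down).
    table = []
    for theta in thetas:
        row = []
        for v in controls:
            b = v - 2.0 * gamma * math.sin(2.0 * theta)      # drift of the angle SDE
            a = 8.0 * gamma * math.sin(theta) ** 2           # sigma(theta)^2
            q = a + h * abs(b)                               # normalisation, > 0 here
            dt = h * h / q                                   # interpolation interval
            p_up = (0.5 * a + h * max(b, 0.0)) / q
            p_down = (0.5 * a + h * max(-b, 0.0)) / q
            disc = math.exp(-lam * dt)
            row.append(((1.0 - disc) / lam, disc, p_up, p_down))
        table.append(row)
    value = [0.0] * (n_cells + 1)       # boundary values stay at zero
    policy = [0.0] * (n_cells + 1)
    for _ in range(max_sweeps):
        new = value[:]
        for i in range(1, n_cells):
            best = None
            for v, (cost, disc, p_up, p_down) in zip(controls, table[i]):
                cand = cost + disc * (p_up * value[i + 1] + p_down * value[i - 1])
                if best is None or cand < best:
                    best, policy[i] = cand, v
            new[i] = best
        change = max(abs(x - y) for x, y in zip(new, value))
        value = new
        if change < tol:
            break
    return thetas, value, policy

# Parameters quoted in the caption of Fig. 2: Omega = 10, gamma = 1, lambda = 3.
thetas, value, policy = discounted_hitting_time(10.0, 1.0, 3.0)
```

The returned `policy` records which extreme control attains the minimum at each grid point, and `value` is zero at the absorbing end points and bounded above by $1/\lambda$.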

By this approach we obtain both the optimal cost function as well as the optimal control strategy. These are depicted in Fig. 2 for the case of the mean hitting time; as the corresponding plots for the expected trajectory look very similar we do not plot them. The optimal control strategy in order to rotate from $|0\rangle$ to $|1\rangle$ is to set the Hamiltonian control $v$ to $+\Omega$, and this is consistent with the intuitively correct approach, viz., in order to reach an angle of $\pi$ as soon as possible we must apply the maximum control in that direction.

In Fig. 3 we depict the mean hitting time to $\pm\pi$, for the system described by Eq. (2), as a function of the control strength. Interestingly there is only a small difference between the behaviour of the mean hitting time and the expected trajectory hitting time, which disappears as $\Omega \to \infty$. In this limit the mean hitting time asymptotes to the Hamiltonian evolution hitting time, as does the expected trajectory hitting time. This is unlike the results obtained in the rapid purification literature, where Wiseman and Ralph's [18] protocol outperforms Jacobs' [19] protocol by a factor of two in the regime of strong control (the limiting case) [20]. The simulations in Fig. 3 were obtained via standard Monte Carlo simulations.
Fig. 2. (a) The mean hitting time to $\pm\pi$ starting from $\theta(0) = 0$, with $\Omega = 10$, $\gamma = 1$ and $\lambda = 3$. (b) The optimal control function for the same parameter values. Note the point $\theta = 0$ at which the cost function is not differentiable in the spatial term. This reaffirms the fact that the optimal cost function is not a classical solution of the HJB equation for our problems.

Fig. 3. The discounted mean hitting time to $\pi$ as a function of the control strength $\Omega$ with $\gamma = 1$. In all simulations the discounting factor is $\lambda = 0.1$. The squares with error bars are the results of stochastic simulations (ensemble size is 5000). The solid line is a plot of the solution to Eq. (5), i.e., the optimal average hitting time. The dashed line is the time at which the discounted Hamiltonian evolution hits $\pi$. The dotted line is the Hamiltonian evolution hitting time.
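The two Hamiltonian reference curves mentioned in the caption of Fig. 3 can be written down directly. The sketch below assumes that "Hamiltonian evolution" means Eq. (2) with the measurement switched off ($\gamma = 0$) and maximal control, so that the angle advances at the constant rate $\Omega$; the interpretation of the discounted curve as the discounted integral over that time is likewise an assumption, and the function names are hypothetical.

```python
import math

def hamiltonian_hitting_time(omega):
    """Time for a rotation at constant rate omega to cover the angle pi."""
    return math.pi / omega

def discounted_time(tau, lam):
    """Integral of exp(-lam * s) over s in [0, tau]."""
    return (1.0 - math.exp(-lam * tau)) / lam

# Reference values at Omega = 10 with the discount factor lambda = 0.1 of Fig. 3.
tau_h = hamiltonian_hitting_time(10.0)
tau_h_discounted = discounted_time(tau_h, 0.1)
```

For a small discount factor the two values are close, which is consistent with the dashed and dotted lines lying near each other.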



We have also considered the case of optimal rotation control between different initial and final states. For example the discounted hitting time between $\pm|x\rangle$ (where $\pm|x\rangle$ are eigenstates of $\sigma_x$) for weak control is less than the time between $\pm|z\rangle$. This is also true for the expected trajectory hitting time. The intuition for this effect is that at $\theta \in \{0, \pm\pi\}$ Eq. (2) is 'like' an attractor, because the measurement projects the state into an eigenstate of the measured observable.

IV. Bounds on the Optimal Solutions for State Preparation from a Mixed State

In this section we obtain a bound on the minimum time taken to evolve from a maximally mixed state to any desired pure state. We first bound the mean hitting time to prepare the target state specified by the pair $(\theta, P)$, where $P = \mathrm{Tr}[\rho^2]$ is its purity, starting from the maximally mixed state (i.e. $P = 1/2$). By monitoring the system continuously a purity $P = 1 - \epsilon$ is achieved, on average, in time $t_{WR}(P) = \sqrt{2P - 1}\,\tanh^{-1}(\sqrt{2P - 1})/(8\gamma)$ (using the Wiseman-Ralph protocol, which is time optimal in the mean hitting time sense [20], [21]). To obtain an upper bound on the time taken to rotate this eigenstate to any other state we consider the worst case: the target state is orthogonal to the initial state. For $|\Omega| \gg |\gamma|$ we bound the rotation time by $\tau_r(\theta)$, i.e., the solution of Eq. (5). For $\epsilon \ll 1$ we may write the upper bound on the optimal time for state preparation as $t_{UB} = t_{WR}(P) + \tau_r(\theta)$, where this bound is understood in the mean hitting time sense.

In the case of the minimum time for the expected trajectory, Jacobs' purification scheme [19] is time optimal [20]. However the control rotations there were instantaneous impulse controls, which is different from our finite strength description. Nevertheless we calculate a lower bound on the minimum time using his result $t_J(P) = -\ln(2 - 2P)/(8\gamma)$. Using the worst case scenario the bound on the rotation time is $t_r(\theta)$, i.e., the solution of Eq. (8). Consequently the lower bound on the realistic time optimal protocol is $t_{LB} = t_J(P) + t_r(\theta)$, which is understood in the expected trajectory hitting time sense. We obtain an upper bound by assuming that, in the worst case, the optimal bounded strength control performs at least as well as the Wiseman-Ralph protocol, i.e., $t_{WR}(P)$ provides an upper bound on the true optimal time $t^*(P)$. In this case the upper bound is $t_{UB} = t_{WR}(P) + t_r(\theta)$. Thus $t_{LB} \leq t^*(\theta, P) \leq t_{UB}$.
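The two purification times quoted in this section are simple closed-form expressions and can be evaluated as follows. The function names are hypothetical; the Jacobs time is written with the argument $2 - 2P$ so that the logarithm is defined for $1/2 \leq P < 1$, and both functions assume the purity lies in that range.

```python
import math

def t_wiseman_ralph(purity, gamma):
    """Mean time to reach the given purity by continuous monitoring alone."""
    r = math.sqrt(2.0 * purity - 1.0)        # Bloch vector length for this purity
    return r * math.atanh(r) / (8.0 * gamma)

def t_jacobs(purity, gamma):
    """Time for the average purity to reach the given value in Jacobs' scheme."""
    return -math.log(2.0 - 2.0 * purity) / (8.0 * gamma)

# Both times vanish at the maximally mixed state P = 1/2 and grow without
# bound as P -> 1; shown here for a purity of 0.99 with gamma = 1.
t_wr_example = t_wiseman_ralph(0.99, 1.0)
t_j_example = t_jacobs(0.99, 1.0)
```

As the purity approaches one, the ratio of the two expressions tends slowly towards one half, in line with the factor of two mentioned in Section III.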

V. Conclusion and Open Problems for Future Work

In continuing our investigation of weak solutions, there is a need for deeper analysis to prove existence and uniqueness in the limit of the discount factor tending to zero. Moreover our work here suggests an exciting possibility: improving the optimal control strategy by using stages of pure Hamiltonian evolution (where we turn the measurement off) and other periods where we use both measurement and feedback.